Skip to content

feat(csharp/src/Drivers/Apache): enhance GetColumns with BASE_TYPE_NAME column #2695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

eric-wang-1990
Copy link
Contributor

@eric-wang-1990 eric-wang-1990 commented Apr 10, 2025

Summary

This PR enhances the C# ADBC driver's GetColumns functionality by adding a new BASE_TYPE_NAME column to the result set. This column provides the base type name without parameters or decorations, making it easier for clients to identify the fundamental data type regardless of its parameterization.

Proposed Changes

  • Added a new BASE_TYPE_NAME column to the GetColumns result schema
  • Enhanced the HiveServer2Statement class to process and populate the BASE_TYPE_NAME column
  • Changed visibility modifiers for several methods from protected to public or internal to enable the enhancement
  • Added support for the VARIANT SQL type in the SqlTypeNameParser
  • Created comprehensive unit tests to verify the functionality with various data types

Implementation Details

The implementation:

  1. Intercepts GetColumns query results
  2. Processes each row to extract the base type name using the existing SetPrecisionScaleAndTypeName method
  3. Adds the base type name as a new column in the result set
  4. Preserves and potentially enhances precision and scale information for parameterized types

Benefits

  • Simplifies client code that needs to identify base types (e.g., all DECIMAL types regardless of precision/scale)
  • Maintains compatibility with existing code as this is an additive change
  • Improves developer experience by providing clearer type information

Testing

Added comprehensive unit tests that verify:

  • The presence of the BASE_TYPE_NAME column in GetColumns results
  • Correct base type extraction for simple types
  • Correct base type extraction for complex types (DECIMAL, INTERVAL, MAP, ARRAY, STRUCT)
  • Precision and scale preservation for DECIMAL types
  • Proper handling of type aliases (INT vs INTEGER)

@github-actions github-actions bot added this to the ADBC Libraries 18 milestone Apr 10, 2025
int rowCount = HiveServer2Reader.GetRowCount(rowSet, columnCount);

// Get metadata again to ensure we have the latest
TGetResultSetMetadataResp metadata = await HiveServer2Connection.GetResultSetMetadataAsync(OperationHandle!, Connection.Client, cancellationToken);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a duplicate call with the call to GetResultSetSchemaAsync. Is there any way to avoid the extra call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this, changed

Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per Bruce's comment, the else clause of GetColumnsAsync should be modified to call GetResultSetSchemaAsync only once and then compute the Arrow schema based on the result of that call by using SchemaParser.GetArrowSchema.

Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Please clean up a little extra whitespace that the linter doesn't like and then this can be checked in.

@CurtHagenlocher CurtHagenlocher merged commit 01fe15a into apache:main Apr 21, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants